Abstract. We address the problem of multiresolution module detection in dense 
weighted networks, where the modular structure is encoded in the weights rather 
than topology. We discuss a weighted version of the q-state Potts method, which 
was originally introduced by Reichardt and Bornholdt. This weighted method can be 
directly applied to dense networks. We discuss the dependence of the resolution of 
the method on its tuning parameter and network properties, using sparse and dense 
weighted networks with built-in modules as example cases. Finally, we apply the 
method to stock price correlation data, and show that the resulting modules correspond 
well to known structural properties of this correlation network. 

1. Introduction 

In recent years, the network approach has proven to be a very efficient way 
of investigating a wide range of complex systems [1, 2, 3, 4]. In this approach, the 
fundamental elements of the system are represented with nodes and the interactions 
between them with links. Sometimes it is enough to consider links as "binary", such 
that each link either exists or not. In this case, it is assumed that the pure topology 
carries enough relevant information about the system under study. However, valuable 
information is often lost if interaction strengths are not taken into account. Because 
of this, the study of weighted networks has recently been receiving a lot of attention. 
In this framework, a scalar weight representing the associated interaction strength is 
assigned to each link. This additional degree of freedom complicates the picture 
somewhat; for example, the generalization of existing measures is not necessarily 
straightforward (see, e.g., [5]). Thus there is a need for developing new network analysis 
methods which focus on the weights instead of pure topology. 

The study of (weighted) networks has mostly focused on systems whose interaction 
structure is inherently sparse, such as air transport networks [6, 7] or social networks 
inferred from electronic communication records [8, 9]. Another approach is to filter 
out interactions which are considered insignificantly weak, resulting in sparse network 
representations even for systems where each element interacts with each other, i.e., 
systems whose "natural" representation is a full or dense weighted network. For such 
networks, it is the interaction strengths themselves that carry the most significant 
information - the networks are constructed on the basis of the assumption that the 
strongest interactions encode the most significant properties for the system under study. 
This is the case for instance with correlation-based networks, in which the weights are 
usually related to correlations between the time series of some relevant activities of the 
nodes (see, e.g., [10]), or distance-based networks [11], in which the weights are related 
to distances between the nodes according to some relevant metric. It is evident that in 
this approach setting the proper threshold below which interactions are discarded is a 
non-trivial task. 

In addition to weighted networks, the attention of network science has recently 
been focusing on "mesoscopic" properties of networks, i.e., structures beyond the scale 
of single nodes or their immediate neighborhoods. A very important and related 
problem is the detection and study of modules or communities, i.e., groups of nodes 
with dense internal connections and sparse connections to the rest of the network 
[12, 13, 14, 15, 16, 17]. A number of methods have been introduced, mostly in the 
context of binary networks. These include various modularity optimization methods 
building on the work by Newman and Girvan [12], the clique percolation method by 
Palla et al. [13], and methods based on statistical inference [18, 19]. Many methods 
have been generalized to deal with weighted networks [20, 21, 22, 23]; however, e.g. 
for the clique percolation method, networks have to be fairly sparse in order for the 
method to be applicable. Regarding the modularity optimization family of methods, it 
has been shown that there is an intrinsic resolution limit [201 1211 125] . However, a lot of 
attention has recently been given to multiresolution methods [151 EH [23l [25l [26] , which 
allow investigating modular structure at various levels of coarse-graining. 

In this work we concentrate on investigating modular structure in dense weighted 
networks, using a weighted version of the q-state Potts method by Reichardt and 
Bornholdt (RB) [15, 26]. This method is closely related to modularity optimization 
methods, and hence there is a resolution limit [20]. However, the method contains a 
tuning parameter which allows changing this limit. Although the method was originally 
introduced in the context of sparse, binary networks, edge weights can readily be taken 
into account [26]. In fact, once this is done, the networks to be analyzed no 
longer need to be sparse; hence, for example when studying stock market correlations, all 
correlation matrix elements can be taken into account and no thresholding is necessary. 

We begin by discussing the weighted RB method, deriving the required weighted 
null model, and then investigate the effect of the tuning parameter on the resolution 
of the method for networks with modular structure encoded in the weights. Then, we 
apply the method to a correlation-based network of stock return time series, i.e., a 
full correlation matrix, whose modular structure has been earlier investigated using a 
wide variety of approaches (see, e.g., [10, 27, 28, 29]). It should be noted here that the 
multiresolution method recently introduced by Arenas et al. [21] bears some similarity 
with the Potts method (see [25]); thus for comparison we apply it to the same data. 
Finally, we draw conclusions. 

2. The RB method 

2.1. Introduction 

Let us begin with a short introduction of the community detection method introduced 
by Reichardt and Bornholdt (RB) [15, 26]. In this method, each node is assigned to 
exactly one module, and the module indices of nodes are considered as spins of a q-state 
Potts model. The goal is to assign nodes to modules in such a way that the energy of 
the system is minimized. In the global optimum, groups of nodes with dense internal 
connections should end up having parallel spins. The Hamiltonian for the system is 
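defined as

$$\mathcal{H} = -\sum_m \left( l_{mm} - \gamma\,[l_{mm}]_{p_{ij}} \right), \tag{1}$$

where $l_{mm}$ is the number of links inside module $m$, $[l_{mm}]_{p_{ij}}$ is the expected number of 
links inside module $m$ given the null model $p_{ij}$, and $\gamma > 0$ is an adjustable parameter. 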
The summation is over all modules. The null model $p_{ij}$ denotes the probability that 
a link would exist between nodes $i$ and $j$ if the network were entirely random, i.e., in 
the absence of modular structure. Essentially, there are two possible choices for the 
null model: constant $p_{ij} = p$, which corresponds to Erdős-Rényi networks [30], and the 
configuration model [5], in which the degree sequence of the original network is retained 
but all links are randomly rewired, such that all correlations are lost to the extent 
allowed by the degree sequence. 

Next we briefly review the derivation of $[l_{mm}]_{p_{ij}}$ for the configuration model. 
Imagine that all the links in the network are cut in half, such that nodes have stubs 
(i.e., half-links) connected to them. Then these stubs are to be randomly reconnected 
to form full links. If two such stubs are randomly picked, the probability that both 
connect to nodes in module $m$ is simply $K_m^2/K^2$, where $K$ is the degree sum of the 
network§ and $K_m$ the degree sum of nodes in module $m$. Since there are $K/2$ pairs of 
stubs, we get 

$$[l_{mm}] = \frac{K}{2}\,\frac{K_m^2}{K^2} = \frac{K_m^2}{2K}. \tag{2}$$

Correspondingly, the probability that the two stubs to be connected belong to different 
modules, say $m$ and $n$, is $2K_mK_n/K^2$. Thus, the expected number of links between 
modules $m$ and $n$ reads 

$$[l_{mn}] = \frac{K}{2}\,\frac{2K_mK_n}{K^2} = \frac{K_mK_n}{K}. \tag{3}$$

Let us now address the question of weighted networks. It seems natural that 
equation (1) transforms to 

$$\mathcal{H} = -\sum_m \left( w_{mm} - \gamma\,[w_{mm}]_{p_{ij}} \right), \tag{4}$$

where $w_{mm}$ and $[w_{mm}]_{p_{ij}}$ denote the sum of weights and expected sum of weights of 
links inside module $m$, respectively. Again, there are essentially two ways to define 
$[w_{mm}]_{p_{ij}}$. The approach taken in [20] is to calculate the expected number of links 
using the configuration model and to assume that each link has average weight, that 
is, $[w_{mm}] = \langle w \rangle [l_{mm}]$. However, here we take another approach, which is analogous to 
the above derivation for the unweighted case and based on the ideas presented in [31]. 
In weighted networks, the strength $s_i$ of node $i$ is defined as the sum of the weights of 
the links attached to it. Consider dividing the strength of each node into small "stubs" 
of weight $ds$, such that node $i$ has $s_i/ds$ stubs emerging from it, and start randomly 
connecting pairs of these stubs. This process is analogous to the above unweighted case, 
and as a result the expected sums of weights of the links inside module $m$ and between 
modules $m$ and $n$ are 

$$[w_{mm}] = \frac{S_m^2}{2S} \quad \text{and} \quad [w_{mn}] = \frac{S_m S_n}{S}, \tag{5}$$

respectively, where $S = \sum_i s_i$ is the strength sum of the network and $S_q$ the strength 
sum of module $q$. When all links have weight $w_{ij} = 1$, the above equations reduce to 
equations (2) and (3). 

Figure 1. a) A ring-like network consisting of $N_b$ cliques, each containing $N_c$ nodes. 
Link weights $w_i$ within modules equal unity, whereas modules are joined by links of 
weight $w_b < 1$. b) The weighted RB method can merge consecutive cliques into larger 
modules, depending on the values of the network parameters and the tuning parameter $\gamma$. 
The hierarchical structure is for illustrative purposes only. In general, the RB method 
does not yield hierarchical modules. 
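The weighted Hamiltonian of Eq. (4) with the strength-based null model of Eq. (5) can be evaluated directly from a weight matrix. The following is a minimal sketch, not the authors' implementation: the function name `rb_energy` and the use of NumPy are assumptions made here, and the weight matrix is assumed to be symmetric with a zero diagonal.

```python
import numpy as np

def rb_energy(W, labels, gamma=1.0):
    """Energy H = -sum_m (w_mm - gamma * S_m^2 / (2 S)) for a given partition.

    W      : symmetric weight matrix with zero diagonal (assumption)
    labels : module index of each node
    gamma  : tuning parameter of the RB method
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    s = W.sum(axis=1)              # node strengths s_i
    S = s.sum()                    # strength sum of the network
    energy = 0.0
    for m in np.unique(labels):
        idx = labels == m
        w_mm = W[np.ix_(idx, idx)].sum() / 2.0   # each internal link counted once
        S_m = s[idx].sum()                       # strength sum of module m
        energy -= w_mm - gamma * S_m**2 / (2.0 * S)
    return energy

# Two triangles with unit weights, joined by one weak link of weight 0.1
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
W[2, 3] = W[3, 2] = 0.1

h_natural = rb_energy(W, [0, 0, 0, 1, 1, 1])   # each triangle is its own module
h_merged = rb_energy(W, [0, 0, 0, 0, 0, 0])    # everything in one module
```

For this toy network and $\gamma = 1$ the natural partition has the lower energy, so the two triangles are kept apart.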

2.2. Resolution of the weighted RB method for sparse and dense networks 

The RB method can be viewed as a general framework for community detection [26], 
which for the unweighted case includes the modularity optimization method as a special 
case ($\gamma = 1$ and configuration model as the null model). Recently, it was shown that the 
resolution of modularity optimization methods is intrinsically limited [21]. In particular, 
in large networks small "physical" communities cannot be resolved and thus there is a 
lower limit to the size of communities which can be detected by the method. This limit 
depends on the number of links in the network and is also inherited by the more general 
RB method [20]. However, by changing the parameter $\gamma$, the resolution of the method 
can be tuned such that small values yield large modules and vice versa. This provides 
a clear advantage over "traditional" modularity optimization, which is restricted to a 
single resolution. 

We now address the issue of resolution of the weighted RB method, beginning with 
a weighted modular network which is sparse, that is, whose average degree $\langle k \rangle \ll N$. 
Consider a simple case, where the $N$ nodes are arranged into modules of constant size 
$N_c$, so that the number of such modules is $N_b = N/N_c$. Let the modules form a ring-like 
structure, as illustrated in Fig. 1, and let each module be a fully connected clique. 
Let the internal links within cliques have weight $w_i = 1$, and successive modules be 
connected by a single link of weight $w_b$, where $w_b < 1$. This presents perhaps the 
simplest possible modular structure for a weighted connected network. 

§ The degree sum of the network is defined by $K = \sum_i k_i$, where $k_i$ is the degree of node $i$. 

The community structure found by the weighted RB method corresponds to the 
global minimum of the Hamiltonian (or energy) defined in Eq. (4). Depending on the 
network parameters $N_b$, $N_c$, and $w_b$, as well as the tuning parameter $\gamma$, this structure 
may or may not correspond to the built-in modules. Let us consider two ways to group 
the built-in modules into communities: the first one is the "natural" grouping in which 
each built-in module is identified as a single community. In the second case, we take 
two successive built-in modules and consider them merged, that is, identified as one 
community. Other built-in modules are still considered as separate communities exactly 
as in the first case. Clearly, if the second grouping has smaller energy (4) than the first 
one, the resolution of the method is limited. A straightforward calculation shows that 
this is equal to the requirement 

$$w_{mn} > \gamma\,[w_{mn}] = \gamma\,\frac{S_m S_n}{S}, \tag{6}$$

where $m$ and $n$ are the built-in modules to be merged, $S = \sum_i s_i$ is again the 
strength sum of the network, and $S_q$ the strength sum of module $q$. Now, $w_{mn} = w_b$, 
$S_m = S_n = N_c(N_c - 1) + 2w_b$, and $S = N_bS_m$. Plugging these into Eq. (6) yields the 
merging condition for the example network: 

$$w_b > \frac{\gamma}{N_b}\left(N_c^2 - N_c + 2w_b\right). \tag{7}$$
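The calculation behind the merging requirement of Eq. (6) can be written out as follows. This is a sketch that uses only Eqs. (4) and (5); the symbol $\Delta\mathcal{H}$ for the energy difference between the merged and the natural grouping is introduced here for illustration and does not appear in the original text. Only the terms of modules $m$ and $n$ differ between the two groupings, so all other modules cancel.

```latex
\begin{align*}
\mathcal{H}_{\text{natural}} &\ni -\left(w_{mm} - \gamma\frac{S_m^2}{2S}\right)
                                 -\left(w_{nn} - \gamma\frac{S_n^2}{2S}\right), \\
\mathcal{H}_{\text{merged}}  &\ni -\left(w_{mm} + w_{nn} + w_{mn}
                                 - \gamma\frac{(S_m + S_n)^2}{2S}\right), \\
\Delta\mathcal{H} = \mathcal{H}_{\text{merged}} - \mathcal{H}_{\text{natural}}
  &= -w_{mn} + \gamma\,\frac{(S_m + S_n)^2 - S_m^2 - S_n^2}{2S}
   = -w_{mn} + \gamma\,\frac{S_m S_n}{S}.
\end{align*}
```

Merging is preferred when $\Delta\mathcal{H} < 0$, that is, when $w_{mn} > \gamma S_m S_n / S = \gamma\,[w_{mn}]$.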

Now, let the network size $N$ increase while the module size remains constant. 
Then, as $N_b = N/N_c$ increases, larger and larger values of $\gamma$ are needed for obtaining 
the built-in modules. Increasing $w_b$ makes merging easier, as expected. For $w_b = 1$, 
Eq. (7) yields the resolution limit for the similar unweighted network studied in [20]. 
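The merging condition for the ring of cliques can be checked numerically. The sketch below builds the example network, evaluates the energy of Eq. (4) for the natural and the merged grouping, and compares the crossover with the value of $\gamma$ obtained by solving Eq. (7) for equality. The helper names (`ring_of_cliques`, `rb_energy`, `ring_merge_gamma`) and the parameter values are illustrative assumptions; the construction assumes $N_b \geq 3$ and $N_c \geq 2$.

```python
import numpy as np

def ring_of_cliques(n_b, n_c, w_b):
    """Weight matrix of n_b cliques (n_c nodes each) joined in a ring by links of weight w_b."""
    n = n_b * n_c
    W = np.zeros((n, n))
    for b in range(n_b):
        lo = b * n_c
        W[lo:lo + n_c, lo:lo + n_c] = 1.0          # intra-clique weights w_i = 1
        nxt = ((b + 1) % n_b) * n_c                # first node of the next clique
        W[lo + n_c - 1, nxt] = W[nxt, lo + n_c - 1] = w_b
    np.fill_diagonal(W, 0.0)
    return W

def rb_energy(W, labels, gamma):
    """Weighted RB energy, Eq. (4), with the null model of Eq. (5)."""
    s = W.sum(axis=1)
    S = s.sum()
    energy = 0.0
    for m in np.unique(labels):
        idx = labels == m
        energy -= W[np.ix_(idx, idx)].sum() / 2.0 - gamma * s[idx].sum() ** 2 / (2.0 * S)
    return energy

def ring_merge_gamma(n_b, n_c, w_b):
    """Value of gamma below which two neighbouring cliques merge, from Eq. (7)."""
    return n_b * w_b / (n_c**2 - n_c + 2.0 * w_b)

n_b, n_c, w_b = 20, 5, 0.5                          # illustrative values
W = ring_of_cliques(n_b, n_c, w_b)
natural = np.repeat(np.arange(n_b), n_c)            # one community per clique
merged = natural.copy()
merged[merged == 1] = 0                             # merge cliques 0 and 1

gamma_star = ring_merge_gamma(n_b, n_c, w_b)
diff_below = rb_energy(W, merged, 0.4) - rb_energy(W, natural, 0.4)
diff_above = rb_energy(W, merged, 0.6) - rb_energy(W, natural, 0.6)
```

With these values `gamma_star` is about 0.48, and the energy difference changes sign between $\gamma = 0.4$ and $\gamma = 0.6$, as Eq. (7) predicts. Because `gamma_star` grows linearly with $N_b$, ever larger values of $\gamma$ are needed to resolve the cliques as the ring grows.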

Let us now move on to a more interesting case where the network in question is 
fully connected, i.e., links exist between every pair of nodes, and the modular structure is purely 
encoded in the weights. Perhaps the simplest possible structure for a fully connected 
network with modules is the case where $N_b$ modules each consisting of $N_c$ nodes are 
constructed such that inside the modules the links have weight $w_i = 1$ and links between 
nodes in different modules have weight $w_b$ ($0 < w_b < 1$), see Fig. 2. Similarly to the 
above analysis for the sparse weighted network, we again consider two ways to group 
the built-in modules to communities: the "natural" grouping and the one in which 
two built-in modules are considered as a single module. Again, the method prefers the 
second grouping over the natural one if it yields smaller energy (Eq. (4)). The condition 
for this is again given by Eq. (6), but now we have $w_{mn} = N_c^2 w_b$ and $S_q = N_c s_i$, where 
$s_i = N_c - 1 + (N_b - 1)N_c w_b$ denotes the (constant) strength of the nodes. Thus, Eq. (6) 
is equivalent to 

$$N_c^2 w_b > \gamma\,\frac{N_c}{N_b}\left[N_c - 1 + (N_b - 1)N_c w_b\right] \approx \gamma N_c^2 w_b, \tag{8}$$

where the approximation is valid when $N_b$ is large. In this case, Eq. (8) further 
simplifies to $\gamma < 1$, where it should be understood that the specific merging value $\gamma = 1$ 
appears as a result of the simple structure of the example case. In a more general scope, 
the expected weight between modules $[w_{mn}] \approx N_c^2 w_b$ is independent of the number of 
modules $N_b$, i.e., network size. Thus, merging is solely controlled by $\gamma$. This is different 
from the sparse network case discussed above, where increasing system size eventually 
triggers merging as the expected number and the total weight of links between modules 
decreases. 

Figure 2. Left: A network consisting of $N_b = 4$ blocks each having $N_c = 10$ nodes. 
Links inside blocks have weight $w_i = 1$ and nodes in different blocks are connected 
by links of weight $w_b = 0.1$. Right: the effect of $\gamma$ on the detected 
modular structure. Large values yield the physical communities, while for small values 
the communities appear as one large module. If the number of blocks $N_b$ is large 
enough, the network size does not affect the $\gamma$ values where merging happens. 

Finally, we analyse the effects of a single strong link between the modules in the 
latter example case. On the basis of the above analysis, merging happens if the total 
weight between the two modules exceeds $\gamma\,[w_{mn}]$, which is again of the order of $N_c^2 w_b$. 
For sufficiently large $N_c$, the expected weight is so large that adding one strong link is 
not enough for merging to occur. Smaller modules are merged more easily. However, 
the resolution limit still depends only weakly on the number of modules, i.e., system 
size. This means that sweeping $\gamma$ can be used to probe communities of different sizes, 
and the suitable range of $\gamma$ values is practically independent of the system size. 

These considerations show that the resolution of the weighted RB method does 
not necessarily decrease when dense networks grow in size, unlike for sparse networks. 
However, for practical purposes, issues such as the distribution of weights both within 
and between the blocks are expected to affect the actual resolution, and the above 
examples should be viewed as illustrative only. 
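The different size dependence of the two example cases can be made concrete by solving the merging conditions for the largest $\gamma$ at which two built-in modules still merge. The sketch below uses Eq. (7) for the sparse ring and the exact (unapproximated) form of Eq. (8) for the fully connected block network; the function names and the parameter values $N_c = 10$, $w_b = 0.1$ are illustrative assumptions.

```python
def ring_merge_gamma(n_b, n_c, w_b):
    """Sparse ring of cliques: merging occurs for gamma below this value (Eq. (7))."""
    return n_b * w_b / (n_c**2 - n_c + 2.0 * w_b)

def dense_merge_gamma(n_b, n_c, w_b):
    """Fully connected block network: merging occurs for gamma below this value (Eq. (8))."""
    s_i = n_c - 1 + (n_b - 1) * n_c * w_b      # constant node strength
    return n_b * n_c * w_b / s_i

n_c, w_b = 10, 0.1
for n_b in (4, 16, 64, 256, 1024):
    print(n_b,
          round(ring_merge_gamma(n_b, n_c, w_b), 3),
          round(dense_merge_gamma(n_b, n_c, w_b), 3))
```

The ring threshold grows linearly with the number of modules, whereas the dense threshold stays below one and approaches it as $N_b$ increases, which is the content of the limit $\gamma < 1$ discussed above.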
3. Example application: modules in a stock correlation network 

As a real-world example, we apply the weighted RB method to a correlation-based 
network of stock return time series. Networks of this type are of special interest as the 
correlations between asset returns are the main input in the classical and still widely 
used Markowitz portfolio optimization theory [32]. Correlations of stock returns were 
first studied from the network point of view by Mantegna [27] , who defined a correlation- 
based metric and was consequently able to identify modules that make sense also from 
the economic point of view by using the maximal spanning tree. This work has been 
extended by Bonanno et al. [28, 33, 34] and Onnela et al. [35, 36], with the overall 
conclusion that there is cluster structure which corresponds well to economic sectors. 
Recently, the structure of correlation-based stock interaction networks has also been 
studied with the weighted version of the clique percolation method [22] and by spectral 
and thresholding analyses [TOl [291 123 EH] ■ 

To construct our network, we use a data set consisting of the daily closing prices of 
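$N = 116$ NYSE-traded stocks from the time period from 13-Jan-1997 to 29-Jan-2000||. 
We estimate the equal-time correlation matrix of logarithmic returns by 

$$C_{ij} = \frac{\langle r_i r_j \rangle - \langle r_i \rangle \langle r_j \rangle}{\sqrt{\left(\langle r_i^2 \rangle - \langle r_i \rangle^2\right)\left(\langle r_j^2 \rangle - \langle r_j \rangle^2\right)}}, \tag{9}$$

where $r_i$ is a vector containing the logarithmic returns of stock $i$. Since there is a 
small number of elements of $C$ which are slightly negative, we define the weights of our 
network by 

$$w_{ij} = |C_{ij}| - \delta_{ij}, \tag{10}$$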

which can be justified by interpreting the absolute values of correlations as measures of 
interaction strength without considering whether the interaction is positive or negative. 
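The construction of the weight matrix from price data can be sketched in a few lines. The function name `correlation_weights` and the input layout (one row per trading day, one column per stock) are assumptions made for this illustration; the random-walk prices in the usage lines are synthetic and stand in for real data.

```python
import numpy as np

def correlation_weights(prices):
    """Weights w_ij = |C_ij| - delta_ij from a (T, N) array of closing prices."""
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(np.log(prices), axis=0)   # logarithmic returns, shape (T - 1, N)
    C = np.corrcoef(returns, rowvar=False)      # equal-time Pearson correlation matrix
    W = np.abs(C)                               # interaction strength regardless of sign
    np.fill_diagonal(W, 0.0)                    # subtract delta_ij: no self-links
    return W

# Synthetic example: 5 independent geometric random walks over 200 days
rng = np.random.default_rng(0)
prices = np.exp(np.cumsum(rng.normal(0.0, 0.01, size=(200, 5)), axis=0))
W = correlation_weights(prices)
```

The resulting matrix is symmetric, has a zero diagonal and entries between 0 and 1, so it can be used as a full weighted network without any thresholding.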

Here, we take a multiresolution approach to the problem of detecting modules in 
the above matrix, and sweep the value of $\gamma$ to obtain the modules of $W$ at multiple levels 
of resolution. For each value of $\gamma$, we assign nodes into modules such that the energy 
(4) is minimized. Evidently, exploring all possible configurations is computationally 
impossible, so that some approximate method has to be employed. The choice 
of method naturally depends on the system size, and for very large systems, greedy 
optimization methods [39, 40] which directly look for local minima might be the only 
solution. For our case, the system is not very large, and we have chosen the simulated 
annealing approach, using single-spin flips as well as block flipping as the elementary 
Monte Carlo operations. It should be noted, however, that it cannot be guaranteed that 
the obtained energy minimum is a global one. For the RB method, there is no way 
around this problem. 
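A simulated annealing minimization of the energy (4) can be sketched as follows. This is a simplified illustration and not the implementation used for the results reported here: it uses single-spin flips only (no block flipping), and the geometric cooling schedule, the number of sweeps, the temperature range and all function names are assumptions. The weight matrix is assumed symmetric with a zero diagonal.

```python
import numpy as np

def rb_energy(W, labels, gamma=1.0):
    """Weighted RB energy, Eq. (4), with the strength-based null model of Eq. (5)."""
    s = W.sum(axis=1)
    S = s.sum()
    energy = 0.0
    for m in np.unique(labels):
        idx = labels == m
        energy -= W[np.ix_(idx, idx)].sum() / 2.0 - gamma * s[idx].sum() ** 2 / (2.0 * S)
    return energy

def anneal_rb(W, gamma=1.0, n_sweeps=200, t_start=1.0, t_end=1e-3, seed=0):
    """Single-spin-flip simulated annealing; returns one module label per node."""
    rng = np.random.default_rng(seed)
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    s = W.sum(axis=1)                 # node strengths
    S = s.sum()                       # strength sum of the network
    labels = np.arange(n)             # start with every node in its own module
    S_mod = s.copy()                  # strength sum of each module label
    for T in np.geomspace(t_start, t_end, n_sweeps):
        for _ in range(n):
            i = rng.integers(n)
            a = labels[i]
            b = rng.integers(n)       # candidate module label (at most n labels)
            if b == a:
                continue
            # total weight from node i to the members of each module
            k = np.bincount(labels, weights=W[i], minlength=n)
            # energy change for moving node i from module a to module b
            dH = (k[a] - k[b]) + gamma * s[i] * (S_mod[b] - S_mod[a] + s[i]) / S
            if dH <= 0 or rng.random() < np.exp(-dH / T):
                labels[i] = b
                S_mod[a] -= s[i]
                S_mod[b] += s[i]
    return labels

# Usage on a small dense block network (3 blocks of 6 nodes, w_b = 0.1)
blocks = np.repeat(np.arange(3), 6)
W = np.where(blocks[:, None] == blocks[None, :], 1.0, 0.1)
np.fill_diagonal(W, 0.0)
labels = anneal_rb(W, gamma=1.0)
```

As with any stochastic optimizer, the result is not guaranteed to be the global minimum; in practice one would compare the energies of several runs with different seeds.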

First, we have investigated the number of modules as a function of $\gamma$ (see Fig. 3(a)). 
For $\gamma \lesssim 0.8$, all nodes are assigned to a single module. When $\gamma$ is further increased, 
the number of modules starts to increase rapidly, until finally each module corresponds 
to a single node. It is worth noting that no plateaus are seen in the graph, except for 
the trivial case of $\gamma \lesssim 0.8$. In Ref. [21], using a related multiresolution method, such 
plateaus were shown to exist for test-case networks, corresponding to built-in hierarchical 
modules. Plateaus would hence yield "natural" choices of the tuning parameter. Their 
absence in Fig. 3(a) means that there is no range of $\gamma$ that would correspond to a stable 
module configuration. However, stability of the number of modules only gives partial 
insight into the stability of the modular structure. Especially for real-world networks 
with modules of different sizes and internal weights, changes in this number may only 
reflect, e.g., splitting of small, weak modules, while the strongest modules remain more 
or less stable when $\gamma$ is increased. This appears to be the case for our stock interaction 
network. Panel (b) of Fig. 3 depicts the sizes of the two largest modules as a function of 
$\gamma$. The sizes remain almost constant for an interval of approx. $\gamma \in [1.4, 3]$, and thus the 
increase in the module number can be attributed to splitting of smaller modules. 

|| The length of the time series is 1000 trading days. 

Figure 3. The number of modules (a) and the sizes of the two largest modules (b) as 
a function of $\gamma$. 

Next, we turn to the modules themselves. In order to visually compare the detected 
modules with known structural features of the investigated correlation matrix, we have 
utilized the maximal spanning tree (MST) method. The MST of a network or a 
matrix is a tree connecting all the $N$ nodes with $N - 1$ links, such that the sum of 
the link weights is maximized. Earlier, it has been shown that for stock correlation 
matrices, branches of the MST correspond well to business sectors or industries for the 
NYSE [27, 33, 34, 35, 36] as well as FTSE [41]. The typical way to categorize stocks into 
business sectors is to use the Forbes classification [12]. Panel (a) in Figure 4 displays 
the MST for the stock network, together with the Forbes classification. For comparison, 
we first set $\gamma = 1$ (Fig. 4(b)), and color the nodes according to modules detected by the 
RB method for the full correlation matrix as above. The value $\gamma = 1$ is of particular 
interest, as in this case the Hamiltonian of Eq. (4) is equivalent to the weighted version 
of modularity [12]. For this value, four modules of sizes 13, 34, 34 and 35 are found. For 
each module, the majority of member nodes are also connected in the MST, and there 
is a correspondence between the MST branches and the modules. The smallest module 
corresponds very well to the Energy sector in the Forbes classification, and the other 
modules roughly to combinations of different sectors. It should be noted here that the 
Forbes classification is an external one, i.e., it is not based on empirical observations on 
stock correlations, and thus some Forbes sectors are also relatively disjoint in the MST 
of Fig. 4(a). 
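The maximal spanning tree of a full weight matrix can be obtained with Prim's algorithm, always adding the heaviest link that connects a new node to the current tree. The sketch below is one possible implementation for a dense symmetric matrix; the function name `maximal_spanning_tree` and the small test matrix are illustrative assumptions, and nothing here is specific to how the MST in this work was computed.

```python
import numpy as np

def maximal_spanning_tree(W):
    """Return the N - 1 links (i, j, weight) of the maximal spanning tree of W."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True                     # grow the tree from node 0
    best_w = W[0].copy()                  # heaviest link from each node to the tree
    best_from = np.zeros(n, dtype=int)    # tree endpoint of that link
    edges = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, -np.inf, best_w)
        j = int(np.argmax(candidates))    # node outside the tree with the heaviest link
        edges.append((int(best_from[j]), j, float(W[best_from[j], j])))
        in_tree[j] = True
        improve = (~in_tree) & (W[j] > best_w)
        best_w[improve] = W[j][improve]   # links via j that beat the previous best
        best_from[improve] = j
    return edges

W = np.array([[0.0, 0.9, 0.1, 0.2],
              [0.9, 0.0, 0.8, 0.3],
              [0.1, 0.8, 0.0, 0.7],
              [0.2, 0.3, 0.7, 0.0]])
edges = maximal_spanning_tree(W)
```

For this four-node example the tree is the chain 0-1-2-3 with total weight 2.4. The dense formulation costs $O(N^2)$ operations, which is unproblematic for a matrix of the size considered here.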

Let us now change the resolution of the RB method by moving towards larger values 
of $\gamma$. Panel (c) of Figure 4 displays the modular structure obtained with $\gamma = 1.4$, i.e., 
at the onset of the "plateau" regime of the two largest module sizes. Only modules 
of size larger than two are depicted by different colors, while the rest of the nodes are 
indicated by open symbols. An immediate observation is that the modules correspond 
remarkably well to the different branches of the MST and very well to the Forbes 
classification. Increasing $\gamma$ further splits the modules into smaller ones: for $\gamma = 2$ the 
number of modules is already 58 and thus their average size is only 2. The largest 
modules, corresponding to the Energy sector and the Electric Utilities industry, are the 
last ones to break at around $\gamma \approx 3$ and $\gamma \approx 4$, respectively. Interestingly, the Energy 
module seems to contain a strong submodule of four nodes. This is also seen as a 
plateau in the graph depicting the size of the second-largest component (Fig. 3(b)), which 
indicates that large values of $\gamma$ can also yield useful information on the modules. 

Finally, we study the correspondence between the modular structure obtained with 
the RB method and the Forbes classification to business sectors in a more quantitative 
way. We use two measures defined in Ref . : the sensitivity defined as the fraction of 
pairs of nodes classified to the same Forbes sector that are assigned to the same module 
by the RB method and, correspondingly, specificity as the fraction of pairs of nodes 
belonging to different sectors that are assigned to different modules by the RB method. 
Sensitivity and specificity are depicted in Figures 5(a) and 5(b), respectively. The 
sensitivity curve shows a sudden increase in the interval $\gamma \in [0.8, 1.8]$. The reason for its 
low initial value is the assignment of all nodes to a single module, as discussed above, 
and the increase corresponds to modules splitting into smaller units which correspond 
well to the Forbes classification. The high value of sensitivity for large $\gamma$ means that 
the relatively small modules given by the RB method are proper subsets of the Forbes 
business sectors. The specificity curve shows a decreasing trend, but its values still 
remain relatively high. This trend is explained by an increasing number of small modules 
(including modules consisting of one node only), such that nodes which belong to a 
common sector appear in different modules. Overall, the above results indicate that the 
modular structure detected by the weighted RB method corresponds well to the Forbes 
classification for a wide range of $\gamma$, and the small modules obtained at large $\gamma$ seem to 
be valid submodules of larger ones. 
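The two pair-counting measures can be computed directly from two label lists. The sketch below implements the definitions exactly as worded above (fractions of same-sector and different-sector node pairs); the function name and the toy labels are hypothetical and are not taken from the data set analyzed here.

```python
from itertools import combinations

def pair_sensitivity_specificity(sectors, modules):
    """Pair-counting comparison of a reference classification and detected modules.

    sensitivity: fraction of same-sector pairs that share a module
    specificity: fraction of different-sector pairs that are in different modules
    """
    same_sector = same_both = diff_sector = diff_both = 0
    for i, j in combinations(range(len(sectors)), 2):
        if sectors[i] == sectors[j]:
            same_sector += 1
            same_both += modules[i] == modules[j]
        else:
            diff_sector += 1
            diff_both += modules[i] != modules[j]
    sensitivity = same_both / same_sector if same_sector else float("nan")
    specificity = diff_both / diff_sector if diff_sector else float("nan")
    return sensitivity, specificity

# Toy example: three "Energy" and two "Utilities" stocks, three detected modules
sectors = ["Energy", "Energy", "Energy", "Utilities", "Utilities"]
modules = [0, 0, 1, 2, 2]
sens, spec = pair_sensitivity_specificity(sectors, modules)
```

In the toy example two of the four same-sector pairs share a module and all six different-sector pairs are separated, giving a sensitivity of 0.5 and a specificity of 1.0.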

For comparison, we have also carried out the above analysis using the recently 
introduced weighted multiresolution method by Arenas et al. [21]. This method 
resembles the Potts approach; however, the tuning parameter $\gamma$ is replaced by the 
parameter $r$, which can be interpreted as representing the weight of a self-link added to 
each node. The number of modules, the sizes of the two largest modules, the sensitivity 
and the specificity as functions of the tuning parameter $r$ are depicted in Fig. 6. 
Comparison with Figs. 3 and 5, in which the same results for the RB method are 
shown, suggests that for the correlation matrix analyzed here, the method of Arenas et al. 
(AFG) and the RB method behave in a very similar manner. 

Figure 4. (a) The maximal spanning tree and business sectors according to Forbes 
[32]. (b) The maximal spanning tree and the modular structure for $\gamma = 1$. Each color 
corresponds to a module. (c) The maximal spanning tree and the modular structure 
for $\gamma = 1.4$. Modules of size larger than two are depicted by different colors and the 
rest of the nodes by empty symbols. 

4. Conclusions 

Here we have presented, analyzed, and applied a weighted version of the q-state Potts 
model approach by Reichardt and Bornholdt [15], introducing a well-motivated null 
model for expected weights within modules. Our target has been to investigate the 
modular structure of dense weighted networks such that instead of the topology, the link 
weights determine the modules. In contrast to conventional approaches, where weights 
considered insignificant are filtered out, our target has been to utilize all information 
contained in the weight matrix. The weighted RB model fulfills this criterion, as it 
can equally well be applied to sparse and dense networks. In addition, it contains 
a parameter that allows tuning its resolution, which is useful for studies of nested 
community structures. Analysis of the resolution limit of the method has shown that for 
simple example cases, dense modular networks behave differently from sparse ones as the 
resolution is only weakly dependent on the network size. As a practical application, we 
have used the method in analysis of the modular structure of a stock correlation matrix. 
Our results indicate that by varying the tuning parameter value, the method is able to 
detect modules which correspond to relevant business sectors, as well as substructure 
inside these modules. Thus it turns out that the weighted Potts method provides a 
feasible approach to community detection in dense networks. 

Figure 5. The sensitivity (a) and the specificity (b) of the modular structure with 
respect to the Forbes classification of business sectors [32] as a function of $\gamma$. The solid 
line is a guide to the eye. 

Figure 6. The number of modules (a), the sizes of the two largest modules (b), the 
sensitivity (c) and the specificity (d) as functions of $r$ with the AFG method. The 
solid line is a guide to the eye. 